VirtualMiner Software Requirements Specification

1 Introduction 

VirtualMiner is a software component that will enable users of various kinds of 
data to view, filter, query and mine their data easily. The target audience consists 
of "lay" users of data (as opposed to trained statisticians and data analysts). 
VirtualMiner consists of two parts, the Analyzer and the Viewer. Used in 
conjunction, the two components provide users with an easy and intuitive way of 
interacting with and discovering knowledge in their data. 

In a multi-user scenario, such as an intranet/Internet, the Analyzer would typically
be used by a data analyst/webmaster to prepare the raw data for analysis. Once
this has been done, the end-user can then use the Viewer to view, filter and
query the data, and view data mining results.

In a single-user scenario, the Analyzer and Viewer can be used by the same
person to set up data for analysis, and view the results. 

Both the Analyzer and the Viewer can be executed from within tools that
developers and end-users are already familiar with, including web browsers,
popular visual development tools, and database and spreadsheet applications.
Thus, there is no need to relearn a complex user interface.

1.1 References 

1. (AF Paper)

2 The VirtualMiner Analyzer 

2.1 Introduction 

The VirtualMiner Analyzer enables a designer to: 

• Set up raw data to enable viewing, filtering and querying 

• Set up one or more Data Mining analyses on the data (the Analyzer 
uses the Attribute Focusing algorithm for mining data [1]) 

2.2 Analyzer Execution 

The VirtualMiner Analyzer should be a software component, capable of running 
within several environments, viz. 

• Web browsers, such as Microsoft Internet Explorer and Netscape
Navigator

• Visual Development tools that are capable of importing software
components, such as Microsoft Visual C++, Microsoft Visual Basic,
Microsoft Visual J++, Symantec Cafe, Borland C++, Sun Java
Workshop, Sun BeanBox, etc.

• Popular database utilities, such as Microsoft Access and Lotus
Approach

• Popular spreadsheet utilities, such as Microsoft Excel and Lotus 1-2-3

2.3 Inputs

The VirtualMiner Analyzer should be able to handle raw data in the following 
formats: 

• ASCII Text Files 

• Microsoft Excel and Lotus 1-2-3 Spreadsheets 

• Local databases - Microsoft Access, Paradox, dBASE, Lotus Approach

• All RDBMS servers - Oracle, Sybase, Informix, SQL Server, Gupta,
DB2

• Tables embedded in HTML files
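
As an illustration of the last input format, the following is a minimal Python sketch of how tables embedded in an HTML file might be read using only the standard library. The class name TableExtractor is hypothetical and is not part of this specification. The sketch assumes that every row and cell tag is explicitly closed, and it does not handle nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects every <table> in a document as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one entry per table; each entry is a list of rows
        self._row = None   # cells of the row currently being read, if any
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text is only kept while a cell is open.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


parser = TableExtractor()
parser.feed(
    "<table><tr><th>Month</th><th>Sales</th></tr>"
    "<tr><td>Jan</td><td>10</td></tr></table>"
)
first_table = parser.tables[0]  # rows of the first table found in the file
```

A list of tables in this form could also back the table selection step described in Section 2.6.6, where the designer picks one table from the data source.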



2.4 Processing 

The processing by the VirtualMiner Analyzer achieves the following purposes: 

• Enables fast OLAP queries on the raw data 

• Enables fast and easy Data Mining queries on the raw data

The Analyzer should read the raw data, and process it, using the following 
parameters provided by the user: 

• Column(s) of interest for OLAP queries 

• Data Mining Questions, each question consisting of:

    • Variable(s) of interest

    • Decision Variable

    • Numeric Attribute, if any

    • Cutoff



(The User Interface for collecting these parameters is described in Section 2.6) 
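
The parameters listed above could be held in a small data structure such as the one sketched below in Python. This is only an illustration: the class and field names are hypothetical, and the type of the cutoff (a floating-point number here) is an assumption, since this specification does not define it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataMiningQuestion:
    """One data mining question, as collected by the Analyzer."""
    variables_of_interest: List[str]
    decision_variable: str
    cutoff: float                            # assumed numeric; not defined in the spec
    numeric_attribute: Optional[str] = None  # "if any"


@dataclass
class AnalysisParameters:
    """All user-provided parameters for one run of the Analyzer."""
    olap_columns: List[str]
    questions: List[DataMiningQuestion] = field(default_factory=list)


# Example with made-up column names.
question = DataMiningQuestion(
    variables_of_interest=["region", "product"],
    decision_variable="returned",
    cutoff=0.5,
)
params = AnalysisParameters(olap_columns=["region", "month"], questions=[question])
```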

2.5 Outputs 

In order to achieve the aims of fast and easy querying on the raw data, it may be
necessary for the Analyzer to convert the raw data into a format more amenable
to fast querying. In such a situation, the processed data should be stored along
with the raw data, and linked to it in some manner, so that when the user views
the raw data, the results of processing are also immediately available (refer to
Section 3 for more details).
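
One possible form for such processed data is a set of pre-computed counts over the columns of interest, so that a query is answered by a lookup rather than a scan of the raw data. The Python sketch below illustrates this idea only; this specification does not prescribe a storage format, and the function name build_summary and the sample column names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_summary(rows, columns):
    """Count rows for each combination of values, over every non-empty
    subset of the given columns. rows is a list of dicts keyed by column name."""
    summary = {}
    for size in range(1, len(columns) + 1):
        for subset in combinations(columns, size):
            summary[subset] = Counter(
                tuple(row[col] for col in subset) for row in rows
            )
    return summary


raw_rows = [
    {"region": "East", "product": "A"},
    {"region": "East", "product": "B"},
    {"region": "West", "product": "A"},
]
summary = build_summary(raw_rows, ["region", "product"])

# "How many rows have region East?" is now a dictionary lookup.
east_count = summary[("region",)][("East",)]
```

The number of subsets grows exponentially with the number of columns, so a sketch like this is only practical when the designer selects a small number of columns of interest.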

2.6 User Interface 

The User Interface of the Analyzer consists of a wizard, which guides the
designer through the process of setting up the parameters for processing the
data. At each step, the designer is shown sample results, using the inputs he /
she has already provided. (If no inputs have been provided to the Analyzer,
the sample results shown should be based on "dummy" inputs.)

On each screen, there will be a "Tutorial" button. Pressing this button will
provide a short, animated tutorial on using the screen. The tutorial will provide
help on using the controls on the screen, as well as suggest possible alternatives
to the choices that the designer has made. This should be achieved in an
easy-going, friendly and fun-to-use manner.

The purpose of the interface is to collect the following information: 

• Type of input data (local database, ASCII text file, etc) 

• Name of file / database which contains data 

• Name of table in which data are stored (required if input is from 
database / HTML file) 

• Column(s) of interest for OLAP queries 

• Data Mining Questions, each question consisting of the following:

    • Variable(s) of interest

    • Decision Variable

    • Numeric Attribute, if any

    • Cutoff



The User Interface consists of the following screens: 



2.6.1 Screen 1 : Welcome Screen

This screen is invoked when the user starts the Analyzer. It displays a suitable
welcome message, informing the designer of the process he / she is about to go
through. The screen contains the following controls:

1. Cancel Button 

Pressing this button will cause the Analyzer to exit.

2. Next Button

Pressing this button will cause Screen 1 to close, to be replaced by Screen 2.

2.6.2 Screen 2 : Data Source Screen

This screen is invoked when the user presses the Next button on Screen 1, or
the Previous button on Screen 3. It contains the following controls: 

1. ASCII Text File Option 

The designer can select this option if the raw data are contained in an ASCII text
file.

2. HTML File Option

The designer can select this option if the raw data are contained in an HTML file.

3. Local Database Option

The designer can select this option if the raw data are contained in a table on a
local database (Microsoft Access / Lotus Approach). Note that "local" in this
context implies a "non-server" database. The database file itself could be located
on a file server on a network. However, it is accessed as if it were a local file, and
not through a database server, such as Sybase or Oracle.

4. Database server Option 

The designer can select this option if the raw data are contained in a table on a 
database server. 

5. Spreadsheet Option 

The designer can select this option if the raw data are contained in a Microsoft
Excel / Lotus 1-2-3 spreadsheet.

6. Next Button

This button will be enabled if the data source type has been selected. Pressing
this button will cause Screen 2 to close. One of the following actions is taken:

• If the designer selected the Database Server option, Screen 4 is displayed

• For all other options, Screen 3 is displayed

7. Previous Button

Pressing this button will cause Screen 2 to close, to be replaced by Screen 1.

8. Cancel Button

Pressing this button will cause the Analyzer to exit.

9. Tutorial Button

Pressing this button will cause the tutorial for this screen to start.

2.6.3 Screen 3 : File Select Dialog Box

This is the standard system File Select Dialog Box. It is displayed if the user has 
selected any option other than the Database server option on Screen 2. Using 
this screen, the designer should be able to select the file where the data are 
stored. (Depending on the selection in Screen 2, she would be able to select an 
ASCII text file, an MS Access or Lotus Approach database, an MS Excel or Lotus 
1-2-3 spreadsheet, or an HTML file). 

Once the designer selects a file, and closes this dialog box, one of the following 
actions is taken: 

• If the designer selected an ASCII text file, Screen 5 is displayed 

• If the designer selected a local database or HTML file, Screen 6 is
displayed

• If the designer selected a spreadsheet, Screen 7 is displayed

If the designer closes this dialog box without selecting a file, control returns to 
Screen 2. 
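
The screen transitions described for Screens 2 and 3 can be summarized as a small routing function. The following Python sketch is illustrative only: the enumeration and function names are hypothetical, and it encodes only the transitions stated in Sections 2.6.2 and 2.6.3.

```python
from enum import Enum


class DataSource(Enum):
    ASCII_TEXT = "ascii_text"
    HTML_FILE = "html_file"
    LOCAL_DATABASE = "local_database"
    DATABASE_SERVER = "database_server"
    SPREADSHEET = "spreadsheet"


def screen_after_data_source(source):
    """Screen shown when Next is pressed on Screen 2."""
    return 4 if source is DataSource.DATABASE_SERVER else 3


def screen_after_file_select(source, file_selected):
    """Screen shown when the File Select dialog box (Screen 3) is closed."""
    if not file_selected:
        return 2  # dialog closed without a selection: back to Screen 2
    if source is DataSource.ASCII_TEXT:
        return 5
    if source in (DataSource.LOCAL_DATABASE, DataSource.HTML_FILE):
        return 6
    if source is DataSource.SPREADSHEET:
        return 7
    # Screen 3 is never shown for a database server source.
    raise ValueError(f"Screen 3 does not apply to {source}")
```

Keeping the routing in one place like this makes it straightforward to check the wizard against the specification, one transition at a time.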

2.6.4 Screen 4 : ODBC Login Screen 

This screen is the standard system ODBC Login screen for selecting a data
source from a database server. It is displayed if the designer selects a database
server as the data source in Screen 2. The designer will be able to select any
ODBC source for the data.

Upon successful login, this screen will close, to be replaced by Screen 6.

2.6.5 Screen 5 : ASCII Text File Import Options Screen 

This screen is displayed if the designer has selected an ASCII text file as the
data source. The designer can specify import options for the text file using this
screen. This screen has the following controls:

1. Field Separator drop-down list

This list contains possible options for the field separator character. The program
should guess the most appropriate character, which should be selected by
default.

2. Text Delimiter drop-down list

This list contains possible options for the text delimiter character. The program
should guess the most appropriate character, which should be selected by
default.

3. Next Button 

Pressing this button causes Screen 5 to close, and Screen 7 to be displayed.

4. Previous Button

Pressing this button causes Screen 5 to close, and Screen 2 to be displayed.

5. Cancel Button

Pressing this button causes the Analyzer to exit.

6. Tutorial Button

Pressing this button causes the tutorial for this screen to start.
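
The default selections for the Field Separator and Text Delimiter lists on this screen require the program to guess from the file contents. One simple heuristic is sketched below in Python: a separator is preferred if it occurs the same number of times on every sample line, and the text delimiter is the candidate quote character that occurs most often. The function names and the candidate lists are assumptions made for illustration; this specification does not state which characters are offered or how the guess is made.

```python
CANDIDATE_SEPARATORS = (",", "\t", ";", "|")  # assumed option list
CANDIDATE_DELIMITERS = ('"', "'")             # assumed option list


def guess_field_separator(sample_lines, candidates=CANDIDATE_SEPARATORS):
    """Return the candidate that occurs equally often on every non-blank line,
    preferring the one with the most occurrences. Falls back to the first
    candidate when no candidate is consistent."""
    lines = [line for line in sample_lines if line.strip()]
    best, best_count = candidates[0], 0
    for sep in candidates:
        counts = {line.count(sep) for line in lines}
        if len(counts) == 1:
            count = next(iter(counts))
            if count > best_count:
                best, best_count = sep, count
    return best


def guess_text_delimiter(sample_lines, candidates=CANDIDATE_DELIMITERS):
    """Return the candidate quote character that occurs most often.
    On a tie (including no occurrences) the first candidate is returned."""
    text = "\n".join(sample_lines)
    return max(candidates, key=text.count)


sample = ['"Ann";"Paris"', '"Bob";"Rome"']
separator = guess_field_separator(sample)
delimiter = guess_text_delimiter(sample)
```

This heuristic does not account for separator characters that appear inside delimited text, so the designer should still be able to override the guess using the drop-down lists.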

2.6.6 Screen 6 : Table Select Screen 

This screen is displayed if the designer selected a local database, database
server or HTML file as the data source. It enables the designer to select a table
from which the data are imported.

This screen contains the following controls:

1. Select Table List

This is a list of the tables in the data source selected by the designer, of which
he / she can select one.

2. Next Button 

Pressing this button causes Screen 6 to close, to be replaced by Screen 7.

3. Previous Button

Pressing this button causes Screen 6 to close, to be replaced by Screen 2.

4. Cancel Button

Pressing this button causes the Analyzer to exit.

5. Tutorial Button

Pressing this button causes the tutorial for this screen to start.

2.6.7 Screen 7 : Selecting an analysis template

This screen enables the designer to use a pre-defined analysis template (refer to
Section 2.6.15 for details on creating an analysis template). In this manner, the
designer can analyze different files of the same type (for example, sales figures
of different months).

This screen contains the following controls: 

1. Define a new Analysis template Option (default) 

This option should be selected if the designer does not wish to use a pre-defined
template, but wants to define a new analysis template.

2. Analyze using a saved template Option

This option should be selected if the designer wishes to use a pre-defined
template.

3. Next Button

Pressing this button causes Screen 7 to close. If the designer selected Option 1
(Define a new analysis template), Screen 9 is displayed. If the designer
selected Option 2 (Analyze using a saved template), Screen 8 is displayed.

4. Previous Button 

Pressing this button causes Screen 7 to close, to be replaced by Screen 2. 

5. Cancel Button 

Pressing this button causes the Analyzer to exit.

6. Tutorial Button

Pressing this button causes the tutorial for this screen to start. 

2.6.8 Screen 8 : Select Analysis Template Dialog Box 

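This dialog box enables the designer to select a saved analysis template.

Once the designer selects an analysis template and closes this dialog box, the
Analyzer proceeds using the selected template.

If the designer closes this dialog box without selecting a file, control returns to
Screen 7.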

2.6.9 Screen 9 : Selecting columns for OLAP queries 

1. Column List

The columns in the selected table are displayed. The designer can select one or
more columns from this table.

2. Sample OLAP Query Textbox

This textbox displays a sample OLAP query involving the column(s) selected by
the designer. (The sample query displayed changes as and when the column(s)
selected by the designer change.)

3. Next Button 

Pressing this button causes this screen to close, and Screen 10 to be displayed.

4. Previous Button

Pressing this button causes this screen to close, to be replaced by Screen 7.

5. Cancel Button 

Pressing this button causes the Analyzer to exit. 

6. Tutorial Button 

Pressing this button causes the Tutorial for this screen to start. 

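2.6.10 Screen 10 : Selecting a Numeric Variable for Data Mining

This screen enables the designer to select a numeric variable for data mining. It
contains the following controls: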

1. Numeric Variable List 
A list of the numeric variables in the selected table is displayed. The designer
can select one numeric variable from this table.

2. Sample Data Mining result Textbox

This textbox displays a sample data mining result involving the numeric variable
selected by the designer. (The sample result displayed changes as and when the
numeric variable selected by the designer changes.)

3. Next Button

Pressing this button causes this screen to close, to be replaced by Screen 11.

4. Previous Button

Pressing this button causes this screen to close, to be replaced by Screen 9.

5. Cancel Button 

Pressing this button causes the Analyzer to exit. 

6. Tutorial Button 

Pressing this button causes the tutorial for this screen to start. 

2.6.11 Screen 11 : Selecting Variables of interest for Data Mining

This screen enables the designer to select the variable(s) for data mining. It
contains the following controls:

1. Variable List

This is a list of the variables in the selected table. The designer can select one or
more variables from this list.

2. Sample Data Mining result Textbox

This textbox displays a sample data mining result involving the variable(s) and
numeric variable selected by the designer. The sample result displayed should
change as and when the designer makes changes in the variable(s) selected for
analysis.

3. Next Button

Pressing this button causes this screen to close, to be replaced by Screen 12.

4. Previous Button

Pressing this button causes this screen to close, to be replaced by Screen 10.

5. Cancel Button 

Pressing this button causes the Analyzer to exit. 

6. Tutorial Button 

Pressing this button causes the tutorial for this screen to start.

2.6.12 Screen 12 : Selecting a decision variable for Data Mining

This screen enables the designer to select a decision variable for data mining. It
contains the following controls:

1. Decision Variable List

This list contains potential candidates for the decision variable. The designer can 
select one variable from this list. 

2. Sample Data Mining result textbox 

This textbox displays a sample data mining result involving the variable(s)
selected for data mining, the numeric variable and the decision variable. The
sample result displayed should change as and when the designer changes the
decision variable selection.

3. Next Button 

Pressing this button causes this screen to close, to be replaced by Screen 13. 

4. Previous Button 

Pressing this button causes this screen to close, to be replaced by Screen 11.

5. Cancel Button

Pressing this button causes the Analyzer to exit.

6. Tutorial Button 

Pressing this button causes the tutorial for this screen to start. 



2.6.13 Screen 13 : Selecting the cutoff for Data Mining 

This screen enables the designer to set the cutoff for data analysis. It contains 
the following controls: 

1. Cutoff Textbox

The designer can enter the desired cutoff in this textbox. The default value is

2. Next Button 

Pressing this button causes this screen to close, to be replaced by Screen 14. 

3. Previous Button 

Pressing this button causes this screen to close, to be replaced by Screen 12. 

4. Cancel Button 

Pressing this button causes the Analyzer to exit.

5. Tutorial Button

Pressing this button causes the tutorial for this screen to start.

2.6.14 Screen 14 : Analysis Description

This screen enables the designer to enter a title for the analysis.

1. Analysis Description Textbox

The designer can enter a description for the data mining analysis in this textbox.

2. Define another data mining question Checkbox

The designer can select this checkbox if he / she wishes to define another data
mining question.

3. Next Button 

4. Previous Button 

Pressing this button causes this screen to close, to be replaced by Screen 13.

5. Cancel Button

Pressing this button causes the Analyzer to exit.

6. Tutorial Button

Pressing this button causes the tutorial for this screen to start.

2.6.15 Screen 15 : Saving the analysis template

This dialog box enables the designer to save the analysis template.

When this dialog box is closed, Screen 16 is displayed.

2.6.16 Screen 16 : Finish

This screen displays a "Finish" message. It contains the following controls:

1. Finish Button 

Pressing this button causes the Analyzer to start processing the data. During
the processing, the Analyzer should display a suitable message.

2. Previous Button

Pressing this button causes this screen to close, to be replaced by Screen 14.

3. Cancel Button

Pressing this button causes the Analyzer to exit without processing the data.

2.7 The Analyzer Online Tutorial

2.8 Issues

2.8.1 Portability

2.8.2 Data Storage

2.8.3 Multiple Users 

2.8.4 Internationalization 

3 The VirtualMiner Viewer 

3.1 User Interface 

3.2 Data Mining and OLAP query results

3.2.1 Collation

3.2.2 Printing / faxing

3.2.3 E-mail 

3.2.4 Publishing to website 

3.3 The Viewer Online Tutorial 

4 Typical Usage Scenario 

Blazer Engine 
OLAP Support 
AF Support 

Categorical attributes only 
Cutoff for size of dataset 
NP-complete problem (cliques)
HTML